Update OS to Ubuntu 24.04 and ROOT to 6.32.04 #103

Open
tvami wants to merge 1 commit into base: main

tvami
Member

@tvami tvami commented Sep 24, 2024

I am adding a new package to the container; here are the details.

Resolves #102

What new packages does this PR add to the development container?

  • I included ROOT via an official image from the ROOT project
  • That image is already based on Ubuntu 24.04
  • I removed Pythia6 and GENIE, given the plans to move to Pythia8 and a new version of GENIE

Check List

  • I successfully built the container using docker
docker build -t my-ldmx-sw-new .

works nicely.

  • I was able to build ldmx-sw using this new container build
just init
just use my-ldmx-sw-new:latest
just compile 

This fails, but with the problem already known in LDMX-Software/ldmx-sw#1470,
except that now it fails even with the minimal compiler settings.
--> So I'm blocked here

  • I was able to test run a small simulation and reconstruction inside this container
# outline of test instructions
cd $LDMX_BASE/ldmx-sw/build
ldmx ctest
cd ..
# run every test config; the glob replaces `ls` and the loop body needs `do`
for c in ldmx-sw/*/test/*.py; do ldmx fire "$c"; done
  • I was able to successfully use the new packages. Explain what you did to test them below:

@tvami
Member Author

tvami commented Sep 26, 2024

Now that the TS stuff is fixed, I'm looking at this again. I see in the log that

#29 ERROR: failed to push ldmx/dev: Canceled: grpc: the client connection is closing
------
 > exporting to image:
------
ERROR: failed to solve: Canceled: failed to push ldmx/dev: Canceled: grpc: the client connection is closing

I don't know what this gRPC message is supposed to mean. @tomeichlersmith ?

@tomeichlersmith
Member

I have never seen this before either. It may be caused by our custom runner at UMN shutting down? I'm not sure, but you can re-run the job and see if it works.

@tvami tvami marked this pull request as ready for review September 26, 2024 21:27
@tvami
Member Author

tvami commented Sep 26, 2024

Ah, I guess so:
"The self-hosted runner: runner-0 lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error."
from here
https://github.com/LDMX-Software/docker/actions/runs/11019831593

Anyway, I'll retry.

@tvami
Member Author

tvami commented Sep 27, 2024

[Image: Screenshot 2024-09-26 at 17 07 13]

Cool, the build does seem to work.
As for testing with older versions of ldmx-sw, I didn't really think about that. If it works, all good, but if it doesn't, can we somehow say that older versions of ldmx-sw should be used with older versions of the docker image?

@tomeichlersmith
Member

tomeichlersmith commented Sep 27, 2024

There is precedent for not supporting older ldmx-sw versions with newer development container images; however, this is not enforced anywhere. Instead, it is simply documented online: https://ldmx-software.github.io/developing/compatibility.html

We could do something similar here. Cut a new major version of the development image and point out that you need to use a newer ldmx-sw with the newer development image.

One thing I do want to point out though is that the test of trunk is failing as well, so it may be an issue with the CI or image itself and not with the older ldmx-sw versions.

Edit: never mind, it looks like ldmx-sw just needs some patches after we update the compiler. This makes sense and has been required before. Similar to the linked documentation online, I'd like there to be some detail on why older ldmx-sw versions can't be compiled, just in case someone wants to go backward.
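
For illustration only, a minimal sketch of how the documented compatibility rule could be checked in a script rather than only written down. The function names `version_ge` and `check_compat` are hypothetical, and the minimum version is passed in by the caller since no actual cutoff has been decided here. It assumes `sort -V` (version sort, as in GNU coreutils) is available.

```shell
#!/bin/sh
# Sketch only: function names and any versions passed in are hypothetical.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) to order dotted version strings.
version_ge() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

# Succeeds when the ldmx-sw version $1 meets the minimum $2 required by an image.
# A leading "v" (as in git tags like v4.0.0) is stripped before comparing.
check_compat() {
  sw=${1#v}
  min=${2#v}
  if version_ge "$sw" "$min"; then
    echo "ok: ldmx-sw $sw meets the image minimum ($min)"
  else
    echo "error: ldmx-sw $sw is older than $min; use an older image" >&2
    return 1
  fi
}
```

A CI step could then call something like `check_compat "$tag" "$minimum"` before attempting a build, where both variables are supplied by the workflow.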

@tomeichlersmith
Member

I thought about this a bit over the weekend and I have two more thoughts.

I would really like to keep the CI testing back to v4.0 at least, so I would request that you focus on figuring out the necessary patch for v4.0.0 (the easiest case is that it's the same patch necessary to get trunk working, in which case we just need to automate it).

I don't like how I originally implemented my backport solution (basically just using sed to change the code https://github.com/LDMX-Software/docker/blob/main/.github/interop/force-legacy-onnx.sh). I think a patch file is clearer to potential readers, potentially more stable, and easier to implement (just go to that tag, hack until it compiles and runs, and then git diff > version.patch). The main potential issue with patch files is that they could interact oddly with the git submodules we used to have, but I think that can be easily avoided by simply using diff and patch rather than git diff and git apply, since the non-git commands just use filepaths, which have stayed mostly consistent. I don't expect this to be implemented here unless you find that switching to this patch-file solution is easier than developing another sed script to patch previous ldmx-sw versions.
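
A rough sketch of the plain diff and patch workflow described above, assuming two side-by-side copies of the source tree (one pristine, one hacked until it compiles). The helper names `make_patch` and `apply_patch` and all directory names are hypothetical; only standard `diff -ruN` and `patch -p1` are used.

```shell
#!/bin/sh
# Sketch only: make_patch/apply_patch and all paths are hypothetical names.
set -eu

# Write a unified diff between two sibling directories inside $1.
# Running diff from the parent keeps the paths relative, so that
# `patch -p1` can later strip just the top-level directory name.
make_patch() {
  workdir=$1
  pristine=$2
  fixed=$3
  out=$4
  rc=0
  (cd "$workdir" && diff -ruN "$pristine" "$fixed") > "$out" || rc=$?
  # diff exits 1 when the trees differ (expected); 2 or more is a real error
  [ "$rc" -le 1 ]
}

# Apply a patch file (given as an absolute path) inside a source tree.
apply_patch() {
  tree=$1
  patchfile=$2
  (cd "$tree" && patch -p1 < "$patchfile")
}
```

For example, `make_patch "$PWD" pristine fixed "$PWD/version.patch"` followed by `apply_patch /path/to/checkout "$PWD/version.patch"` would replay the changes onto a fresh checkout. Because only file paths are involved, it makes no difference whether a directory was a git submodule at that tag.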

Development

Successfully merging this pull request may close these issues.

Update OS to Ubuntu 24.04
2 participants